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Army  Research  Office  Grant  #  DAAH04-93-G-0453  has  supported  the  purchase  of  24  additional 
compute  nodes  that  were  installed  in  the  Intel  iPS  C/860  hypercube  at  the  University  of  Kentucky  (UK), 
rendering  a  32-node  multiprocessor  computer.  This  facility  has  allowed  the  investigators  to  explore  and 
extend  the  boundaries  of  electromagnetic  simulation  for  important  areas  of  Defense  concerns  including 
microwave  monolithic  integrated  circuit  (MMIQ  design/analysis  and  electromagnetic  materials  research  and 
development  The  iPSC/860  has  provided  an  ideal  platform  for  MMIC  circuit  simulations.  A  number  of 
parallel  methods  based  on  direct  tune-domain  solutions  of  Maxwell’s  equations  have  been  developed  on  the 
iPSC/860,  including  a  parallel  finite-difference  time-domain  (FDTD)  algorithm,  and  a  parallel  planar 
generalized  Yee-algotithm  (PGY).  The  iPSC/860  has  also  provided  an  ideal  platform  on  which  to  develop  a 
“virtual  laboratory*  to  numerically  analyze,  scientifically  study  and  develop  new  types  of  materials  with 
hwirfif  ial  oWtfnmagnMir  properties.  These  materials  simulations  are  capable  of  assembling  hundreds  of 
microscopic  inclusions  from  which  an  electromagnetic  full-wave  solution  will  be  obtained  in  toto.  This 
prwwfni  simulation  tool  has  enabled  research  of  the  full-wave  analysis  of  complex  multicomponent  MMIC 
devices  and  the  electromagnetic  properties  of  many  types  of  materials  to  be  performed  rwmerically  rather 
than  strictly  in  the  laboratory. 


14.  SUBJECT  TERMS 

iPSC/860  hypercube,  MMIC,  parallel  algorithms,  FDTD,  PGY, 
materials  characterization,  chiral  media,  bianisotropic 


”  SUgR^UASSJNtATWN 
UNCLASSIFIED 


NSN  7$NM)Wt0-mi 


HIGH  PERFORMANCE  ELECTROMAGNETIC 
SIMULATION  TOOLS 


Final  report 

Stephen  d.  Gedney  and  Keith  w.  Whites 


October  27,  1994 

U.S.  ARMY  RESEARCH  OFFICE 

Grant  #  DAAHO4-93-G-0453 
DEPARTMENT  OF  ELECTRICAL  ENGINEERING 

University  of  Kentucky 


APPROVED  FOR  PUBLIC  RELEASE 
Distribution  unlimited 


TABLE  OF  CONTENTS 


SECTION _  PAGE 


1.  STATEMENT  OF  PROBLEMS  STUDIED . 1 

2.  SUMMARY  OF  RESULTS . 2 

2.1  Introduction . 5 

2.2.  Full  Wave  Analysis  of  Microwave  Monolithic  Integrated 

Circuits  -  S.  Gedney . 4 

2.2.1  Parallel  FDTD  Algorithm . 4 

2.2.1  Parallel  PGY  Algorithm . 5 

2.2.3  Results .  11 

2.2.4  References .  17 

2.3.  Electromagnetic  Properties  of  Materials 

-  K.  W.  Whites . , .  18 

2.3.1  Solution  Methodology .  18 

2.3.2  Results . 21 

2. 3. 2.  a  Partial  Verification  of  the  Material  Model  Method . 21 

2.3.2. b  Free  space  host,  lossless  inclusions  results . 23 

2.3.2. C  Lossless  host  and  inclusions  results . 26 

2.3.2. d  Lossy  host,  lossless  inclusions  results . 26 

2.3.2. e  Free -space  host,  lossy  inclusions  results . 28 

2.3.3  References . 30 

3.  PUBLICATIONS . 31 

3.1  Journal  Papers  Published/Submitted . 31 

3.1.1  S.  Gedney . 31 

3.1.2  K.  Whites . 31 

3.1  Conference  Papers  Published/Presented . 31 

3.2.1  S.  Gedney . 31 

32.2  K.  Whites . 32 

4.  SUPPORTING  PERSONNEL . 32 

5.  REPORT  OF  INVENTIONS . 32 


1.  STATEMENT  OF  PROBLEMS  STUDIED 


Army  Research  Office  Grant  #  DAAHO4-93-G-0453  has  supported  the  purchase  of  24 
additional  compute  nodes  that  were  installed  in  the  Intel  iPS  C/860  hypercube  at  die  University  of 
Kentucky  (UK).  This  has  resulted  in  a  32-node  iPSC/860,  providing  nearly  a  GFLOPS  of 
computing  power,  512  MB  of  core  memory  and  6  GB  of  concurrent  I/O.  This  facility  has  allowed 
(be  investigators  to  explore  and  extend  die  boundaries  of  electromagnetic  simulation  for  important 
areas  of  Defense  concerns  including  microwave  monolithic  integrated  circuit  (MMIC) 
design/analysis  and  electromagnetic  materials  research  and  development. 

The  32-node  iPSC/860  hypercube  has  provided  an  ideal  platform  for  MMIC  circuit  simulations. 
A  number  of  parallel  methods  based  on  direct  time-domain  solutions  of  Maxwell’s  equations  have 
been  developed  on  the  iPSC/860,  including  a  parallel  finite-difference  time-domain  (FDTD) 
algorithm,  and  a  parallel  planar  generalized  Yee-algorithm  (PGY).  With  the  development  of  these 
techniques,  this  powerful  platform  has  enabled  the  full- wave  analysis  of  complex  multicomponent 
MMIC  devices  for  a  number  of  applications. 

The  iPSC/860  has  also  provided  an  ideal  platform  on  which  to  develop  a  “virtual  laboratory”  to 
numerically  analyze,  scientifically  study  and  develop  new  types  of  materials  with  beneficial 
electromagnetic  properties.  These  materials  simulations  are  capable  of  assembling  hundreds  of 
microscopic  inclusions  from  which  an  electromagnetic  full-wave  solution  will  be  obtained  in  toto. 
For  sparse  distributions  of  inclusions,  this  numerical  laboratory  is  able  to  accommodate  literally 
thousands  of  inclusions.  This  powerful  simulation  tool  has  enabled  research  into  the 
electromagnetic  properties  of  many  types  of  materials  to  be  performed  numerically  rather  than 
strictly  in  the  laboratory. 


,lacr«salon  For 

"kHs  6RA&I 
me  TAB 


1 


2.  SUMMARY  OF  RESULTS 

2.1  Introduction 

The  32-node  iPSC/860  hypercube  has  provides  a  high  performance  simulation  platform  that  has 
significantly  benefited  two  distinct  research  projects  currently  ongoing  at  the  University  of 
Kentucky:  1)  The  full-wave  analysis  of  microwave  monolithic  integrated  circuits  (MMICs),  and 
2)  electromagnetic  materials  research  and  development 

In  the  last  decade,  MMICs  have  played  a  leading  role  in  the  design  of  microwave 
communications  systems.  As  the  fabrication  technologies  of  these  devices  have  matured,  both 
military  and  commercial  applications  have  demanded  smaller  and  denser  MMICs  operating  over 
larger  handwidths  and  at  higher  operating  frequencies  (into  the  Ka  and  U  bands).  This  has  lead  to 
an  increased  need  for  more  rigorous  analyses  that  explicitly  model  the  electromagnetic  interactions 
within  these  devices.  Although,  it  is  instructive  to  isolate  each  element  of  the  MMIC  and  analyze 
them  independently,  proper  characterization  of  densely  packaged  circuits  require  the  rigorous 
analysis  of  entire  circuits  simultaneously,  or  large  blocks  of  circuits.  To  this  end,  coupling 
between  devices  that  can  significantly  perturb  die  circuit  operation  can  be  accounted  for. 

The  objective  of  this  research  has  been  to  complete  the  development  of  a  MMIC  circuit  simulator 
based  on  a  direct  time -domain  solution  of  Maxwell's  equations  and  parallel  algorithms.  The 
simulator  is  capable  of  analyzing  MMICs  using  either  the  traditional  finite-difference  time-domain 
(FDTD)  method  or  the  robust  planar  generalized  Yee-algorithm  (PGY).  The  simulator  is  capable 
of  analyzing  passive  devices  with  lumped  loads,  as  well  as  active  (nonlinear)  devices.  To  model 
highly  complex  circuits  using  the  generalized  Yee-algorithm,  the  simulator  has  been  interfaced  with 
the  computer  aided  design  (CAD)  system  SDRC  IDEAS  for  modeling  the  circuits  and  performing 
the  mesh  generation  for  its  discrete  representation  and  a  graphics  package  for  visualization  and 
analysis  of  results.  With  this  potential,  models  consisting  of  upwards  to  80  X  106  degrees  of 
freedom  have  been  analyzed  using  the  FDTD  algorithm  (on  a  regular  grid),  and  upwards  to  60  x 
106  degrees  of  freedom  have  been  analyzed  using  the  PGY  algorithm  on  the  32-node  iPSC/860. 
This  has  provided  a  powerful  computational  laboratory  and  has  facilitated  ongoing  research  of  the 
design  and  development  of  MMIC  circuits  for  microwave  communications  in  the  upper  GHz 
frequency  bands. 

The  second  area  of  focus  for  this  high  performance  simulation  platform  is  one  of  today’s  most 
dynamic  areas  of  physical  science  —  materials  research.  Using  sophisticated  manufacturing 
techniques  and  the  availability  of  new  types  of  molecules,  it  is  possible  to  build  new  exotic 
materials  having  unusual  properties.  The  additional  compute  nodes  which  have  expanded  the 
University  of  Kentucky  Intel  iPSC/860  have  been  used  for  numerically  constructing  a  “virtual 
laboratory”  in  which  the  electromagnetic  properties  of  new  types  of  materials  have  been 
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investigated.  Currently,  an  ongoing  research  program  is  focusing  on  the  determination  of  the 
macroscopic  constitutive  parameters  for  an  interesting  class  of  substance  called  chiral  materials. 
The  modifier  “chiral,”  from  the  Greek  word  meaning  handed,  refers  to  the  nature  of  the  inclusions 
which  constitute  the  material  at  the  microscopic  level.  The  impetus  for  this  research  is  that  on  the 
macroscopic  level,  these  materials  have  been  shown  to  provide  unusual  and  sometimes  very 
.  beneficial  behavior  in  such  applications  as  radar  absorbent  materials  and  guided-wave  applications. 
For  wave  absoiption,  understanding  and  modeling  the  interactions  of  the  inclusions,  the  host 
material  and  the  incident  wave  at  the  microscopic  level  is  necessary  to  both  develop  a  macroscopic 
model  as  well  as  to  design  and  optimize  this  behavior. 

In  macroscopic  chiral  studies  which  have  appeared  in  the  literature,  the  constitutive  parameters 
describing  the  macroscopic  behavior  of  the  chiral  substance  are  assumed  known  from  which 
parametric  studies  are  performed.  In  reality  these  parameters  are  not  known  and  to  the  author’s 
knowledge  no  other  comprehensive  treatment  has  been  given  to  their  determination.  Using 
numerical  simulations,  a  technique  to  compute  these  constitutive  parameters  for  effectively  chiral 
materials  has  been  developed  and  results  indicate  that  such  a  procedure  is  attainable.  As  would  be 
expected  such  a  numerical  simulation  is  very  costly  in  terms  of  computational  resources. 
Additionally,  this  research  program  seeks  to  analyze  other  types  of  materials,  not  only  chiral.  With 
this  in  mind,  the  algorithms  and  procedures  which  have  been  developed  were  designed  with  this 
generality  in  focus. 
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2.2.  Full  Wave  Analysis  of  Microwave  Monolithic  Integrated 
Circuits  -  S.  Gedney 

2.2.1  Parallel  FDTD  Algorithm 


The  FDTD  method  is  based  on  the  discretization  of  Maxwell's  curl  equations 

£  ~ + =  V  x  // 
dt 


(2.1a) 

(2.1b) 


using  central  difference  approximations  of  time  and  spatial  derivatives.  To  this  end,  the  vector 
fields  are  projected  onto  the  edges  of  a  regular  orthogonal  dual,  staggered  grid.  Based  on  this 
discretization,  the  curl  equations  can  be  expressed  in  a  discrete  form  as  [1] 

J7"+i/2  =  HH-m  +  AhEn  (2.2) 

£"+1  =  DtEn  +  AtHH+m  (2.3) 

where  H  and  E  are  vectors  of  coefficients  representing  the  vector  field  intensities  on  the  grid 
edges,  die  superscript  n  represents  the  time  increment,  De  is  a  diagonal  matrix,  and  Ae  and  A/,  are 
sparse  matrices.  The  entries  of  these  matrices  are  found  in  [1].  Equations  (2.2)  and  (2.3)  provide 
an  explicit  time-marching  solution  for  the  electric  and  magnetic  field  intensities  within  the  discrete 
volume.  By  staggering  the  electric  and  magnetic  fields  in  both  space  and  time,  the  solution  is 
second-order  accurate  in  both  space  and  time  (assuming  uniform  grids)  providing  the  time  step 
satisfies  the  stability  criterion 

A t< 


The  FDTD  algorithm  has  the  principle  advantage  that  since  the  grid  is  regular  and  orthogonal,  De, 
Ae  and  Ah  are  regular  and  never  need  be  stored  explicitly,  but  are  easily  computed  each  iteration. 
This  results  in  an  extremely  memory  efficient  algorithm. 

The  parallel  FDTD  algorithm  is  based  on  a  spatial  decomposition  of  the  regular  grid  structure. 
To  this  end,  the  original  domain  is  spatially  decomposed  into  contiguous  sub-domains  which  are 
rectangular  in  shape,  non-overlapping,  sharing  common  surfaces  only,  and  are  of  equal  size  (see 
Fig.  2.1).  The  boundaries,  or  surfaces,  shared  by  sub-domains  are  chosen  by  taking  slices  along 
edges  of  the  primary  grid.  Each  subdomain  is  then  mapped  onto  independent  processors.  Due  to 
file  regularity  of  the  grid,  die  work  load  is  easily  balanced  between  processors. 

Figure  2.1  illustrates  the  shared  surface  between  two  contiguous  sub-domains  with  a  shared 
surface  in  an  x  =  constant  plane.  The  arrows  represent  the  tangential  electric  fields  on  the  shared 
surface,  and  the  dots  represent  the  normal  magnetic  field  vectors.  The  tangential  electric  fields 
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A z2 


(2.4) 
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Fig.  2.1.  Shared  interface  between  two  sub-domains  along  the  y-z  plane.  The  arrows 
represent  the  shared  electric  field  vectors  tangential  to  the  shared  boundary 
and  the  dots  represent  the  normal  magnetic  field  vectors  normal  to  the  shared 
boundary. 

ey^!w  811(1  eT^n  “Pda*6*1  using  (2.3)and  is  dependent  on  the  magnetic  field  vectors  normal  to 

the  four  faces  sharing  these  edges.  However,  each  processor  has  at  most  three  of  the  four  faces 
needed  to  update  the  electric  field  in  local  memory,  and  it  becomes  necessary  to  retrieve  the  needed 
data  from  adjacent  processors. 

The  magnetic  field  vectors  normal  to  the  shared  interface  are  updated  using  (2.2).  The  update  of 
die  magnetic  field  passing  through  a  face  is  proportional  to  the  curl  of  the  electric  field  about  the 
edges  bounding  the  face.  Since  each  processor  has  the  updated  value  of  the  tangential  electric 
fields  on  the  shared  interface,  the  normal  magnetic  field  can  be  updated  independently  on  each 
processor.  Therefore,  interprocessor  communication  is  not  needed  when  updating  the  magnetic 
fields  within  each  sub-domain.  Rather,  it  is  much  more  expedient  to  simply  update  the  normal 
magnetic  fields  in  the  shared  boundary  redundantly  on  each  processor  sharing  the  face. 

2.2.7  Parallel PGY Algorithm 

The  PGY  algorithm  is  based  on  the  discretization  of  Ampere's  and  Faraday's  laws  in  their 
integral  form  [2, 3]  by  projecting  die  vector  fields  onto  the  edges  of  a  dual,  staggered  grid.  Unlike 
die  FDTD  algorithm,  the  grid  is  assumed  to  be  unstructured  and  irregular.  The  PGY  algorithm  is 
specifically  applied  to  structures  which  have  planar  symmetry,  namely  three-dimensional 
geometries  that  can  be  uniquely  described  by  a  projection  onto  a  two-dimensional  plane.  For 
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example  a  micro  strip  circuit  printed  on  a  planar  dielectric  substrate  backed  by  a  ground  plane  (not 
necessarily  infinite)  can  be  said  to  have  planar  symmetry.  By  exploiting  this  symmetry,  the  three- 
dimensional  grid  can  then  be  described  by  a  two-dimensional  grid  which  is  extruded  into  the  third- 
dimension  in  a  regular  sense  [2].  The  principal  advantage  of  this  assumption  is  that  only  the  two- 
dimensional  grid  need  be  stored,  and  die  update  matrices,  similar  to  (2.2)  and  (2.3),  need  only  be 
stoned  for  one  layer  of  cells.  This  leads  to  a  tremendous  savings  in  memory  overhead.  Finally,  the 
method  is  fully  explicit  and  has  been  shown  to  be  a  highly  scalable  algorithm  [2, 3]. 

The  planar  generalized  Yee-algorithm  is  based  on  a  direct  solution  of  the  time-dependent 
Maxwell's  equations  in  their  integral  form.  The  electric  and  magnetic  field  intensities  are  initially 
normalized  as 


e  =  E!jno 


(2.5) 


where  tjq  is  the  characteristic  wave  impedance  in  free  space.  Faraday's  law  and  Ampere's  law  are 
then  expressed  in  their  integral  form  as 

je<a  =  -~§v,hdS  (2.6) 

jhdl=~§ere.dS  +  rto§aedS  (2.7) 

where  c0  is  the  free  space  velocity  of  light,  and  £r  arc  the  relative  permeability  and  permittivity, 
respectively,  and  a  is  the  absolute  conductivity.  The  principle  advantage  of  this  normalization  is 
that  the  magnitudes  of  e  and  h  will  be  of  the  same  order,  reducing  rounding  error.  Furthermore, 
it  is  much  more  convenient  to  work  with  the  relative  permittivity  and  permeabilities  rather  than  their 
absolute  values. 

Faraday's  and  Ampere’s  laws  are  expressed  in  a  discrete  form  by  mapping  e  and  h  into  a 
discrete  three-dimensional  space.  The  mapping  consists  of  projecting  the  vector  fields  onto  the 
edges  of  a  dual  grid,  composed  of  two  staggered  grids,  referred  to  as  the  primary  and  secondary 
grids.  Each  grid  is  a  three-dimensional  grid  that  is  described  as  being  regular  along  the  vertical 
direction  (assumed  to  be  the  z-direction),  and  is  unstructured  in  the  horizontal  direction. 
Conceptually,  this  grid  can  be  generated  by  extruding  a  two-dimensional  unstructured  grid  in  the 
vertical  direction,  and  segmenting  it  at  discrete  heights,  as  illustrated  in  Fig.  2.2.  The  secondary 
*nd  is  staggered  within  the  primary  grid  such  that  its  vertices  lie  at  the  centroids  of  the  primary  grid 
cells,  and  the  edges  of  the  secondary  grid  connect  the  centroids  by  passing  through  tire  faces  of  the 
primary  grid. 

The  electric  and  magnetic  fields  are  then  decomposed  into  orthogonal  components 
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Fig.  2.2  An  example  of  the  primary  grid  described  by  similar  two-dimensional 
unstructured  grids  cascated  in  the  vertical  z-direction  in  a  regular  sense. 
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Subsequently,  the  transverse  electric  and  magnetic  fields  are  mapped  onto  the  horizontal  edges  of 
the  primary  and  secondary  grids,  respectively.  Likewise,  the  vertical  electric  and  magnetic  fields 
are  mapped  onto  the  vertical  edges  of  the  primary  and  secondary  grids,  respectively.  The  vector 
fields  are  assumed  to  be  constant  along  their  respective  edge  lengths,  as  well  as  over  the  dual  face 
through  which  they  pass. 

Based  on  the  above  discretization,  Faraday's  and  Ampere's  laws  are  then  mapped  into  the 
discrete  space.  The  time  derivative  is  then  approximated  using  a  central  difference  expression, 
which  is  second-order  accurate  if  the  fields  are  staggered  in  time.  This  leads  to  [2] 

1 


K*x  (k)  =  h",  (k)  - 


5X 2  me, 


(2.9) 


f  ,4 


\dz+(e^(k  +  \)-e'*Hk) 


l, 

J  J 
(2.10) 
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(2.11) 


(2.12) 


where  df  and  bt  are  the  flux  densities  in  the  transverse  plane,  At  is  the  time  increment,  n  is  the  time 
index,  k  is  the  index  along  z,  Ap  and  As  are  the  areas  of  the  primary  and  secondary  grid  faces, 
respectively,  Npi  and  Nsj  are  the  number  of  edges  bounding  the  i-th  primary  and  the  j-th  secondary 
grid  faces,  respectively,  and  the  £{  are  the  length  vectors  of  the  primary  or  secondary  grid  edges. 
The  material  parameters  £/-,  fir  and  a  are  assumed  to  be  piecewise  homogeneous  in  both  the  z- 
direction  as  well  as  in  the  transverse  direction.  At  the  interface  of  two  unlike  medium,  the 
parameters  are  assigned  an  average  value,  as  described  in  Appendix  A  of  [4]. 

Based  on  (2.6)  and  (2.7),  it  is  recognized  that  the  flux  densities  updated  in  (2.10)  and  (2.1 1)  are 
normal  to  the  faces.  However,  the  corresponding  field  intensities  on  the  dual  edges  passing 
through  these  faces  are  not  necessarily  normal  to  the  faces.  As  a  result,  the  flux  densities  must  be 
projected  onto  the  edges  before  the  dual  fields  can  be  updated.  Since  only  one  component  of  the 
field  is  locally  known,  an  auxiliary  operator  must  be  introduced  to  perform  the  projection.  To  this 
end,  the  projection  operators  implemented  in  [4]  are  used  to  project  the  fields  onto  the  dual  edge 
passing  through  the  face. 

Let  Nf  be  the  normal  area  vector  to  a  primary  grid  transverse  face,  and  s  be  the  unit  vector 
along  the  dual  grid  edge  passing  through  the  face  (Fig.  2.3).  Using  (2.10)  the  magnetic  flux 
densities  projected  onto  the  normals  of  all  the  primary  grid  transverse  faces  are  updated. 
Subsequently,  for  each  face,  a  general  flux  density  vector  b  is  introduced.  From  (2.10),  b-NP  is 
known  at  each  edge.  In  Fig.  2.3,  the  edge  identified  is  bound  by  vertices  1  and  2,  which  is 
identified  by  the  index  1  =  12-  Each  vertex  is  also  shared  by  two  additional  edges  which  share  a 
common  cell.  Let  j  represent  one  of  these  edges,  where  j  - 12-  The  normal  area  vector  to  the  jf-th 
edge  associated  with  the  i-th  vertex  is  ND  j .  Subsequently,  we  define  a  general  flux  density  vector 

bi  j  to  be  the  local  estimate  of  the  magnetic  flux  vector  associated  with  the  i-th  vertex  and  the  ;-th 
edge,  where  bitj  is  computed  by  solving  the  two-dimensional  system  of  equations 
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Fig.  2.3  Normal  to  a  transverse  primary  face,  and  a  dual  edge  passing  through  the  face. 


biJ-ND=bND 

krK  =*•<. 


(2.13) 


where  the  right-hand-side  is  known  from  (2.10).  Subsequently,  introducing  the  weighting 
coefficient  wi  J  =  |z  •  (ND  x  ND^  )|,  the  flux  density  projected  onto  the  dual  edge  is  expressed  as 


2  2 
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(2.14) 


The  field  updates  are  then  computed  using  (2.10)-(2.14).  However,  it  is  realized  that 
computing  the  parameters  for  these  equations  requires  a  significant  number  of  floating  point 
operations,  leading  to  a  highly  inefficient  algorithm.  However,  by  employing  standard  finite- 
element  type  techniques,  the  computational  efficiency  can  be  greatly  enhanced  by  treating  these 
linear  operators  as  sparse  matrices.  To  this  end,  (2.10)-(2.14)  can  be  expressed  in  reduced  form 
as 


=[*;],+ a,  [*r,nL 

T£/i+1/21 
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[Drmi=D,prml+\ 
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(2.15) 

(2.16) 

(2.17) 

(2.18) 

(2.19) 
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(2.20) 


[erm]k= \,[orm  L 


where  the  subscript  k  refers  to  the  discrete  height  along  the  z -direction,  Dt  and  Bt  are  the  flux 
densities,  the  D  's  are  diagonal  matrices,  and  the  A 's  are  sparse  matrices.  Note  that  these  matrices 
are  only  associated  with  the  two-dimensional  grid  since  they  are  the  same  for  all  values  of  k 
(inhomogeneities  in  material  parameters  are  easily  built  into  these  expressions).  As  a  result,  the 
additional  memory  required  to  store  these  matrices  is  nominal. 

Finally,  the  field  updates  are  performed  using  (2.15H2.20),  providing  an  extremely  efficient 
computational  technique  that  is  second-order  accurate  for  the  simulation  of  the  time-varying  fields. 
Furthermore,  the  solution  is  stable,  providing  [4] 


A t< 


(2.21) 


where  is  the  minimum  edge  length  in  the  model. 

Once  the  matrices  are  computed,  the  field  updates  reduce  to  a  series  of  sparse -matrix  -  vector 
multiplications,  which  is  readily  parallelizable.  The  parallel  PGY  algorithm  is  based  on  a  spatial 
decomposition  of  the  three-dimensional  grid  into  contiguous,  non-overlapping  subdomains.  This 
spatial  decomposition,  or  partitioning,  of  the  three-dimensional  unstructured  mesh  is  being 
performed  by  first  partitioning  the  two-dimensional  unstructured  mesh  using  the  Recursive  Inertia 
Partitioning  (RIP)  algorithm  [5]  and  a  Greedy  algorithm  [6],  and  then  using  a  trivial  partitioning 
scheme  along  the  regular  dimension  of  the  grid  which  is  similar  to  the  FDTD  partition. 

Once  the  mesh  is  decomposed  into  subdomains,  a  single  subdomain  is  assigned  to  each 
processor.  The  matrices  in  (4)  are  then  expressed  as  a  subassembly  of  matrices,  where  each  sub 
matrix  represents  the  updates  of  the  fields  within  each  subdomain.  Subsequently,  the  matrix- 
vector  products  on  each  processor  are  computed  as 

AUw  -  ■  •  '  N -hared  f  ^shared  \ 

(2.22) 


-  -i*  ^“(-shared  \  shared  \ 

AiXj  —  Ai  X;  +  2^  [Ai,j  Xj  J  +  2]  ^x[  Ajj  X j  J 

7=1  V  J  7=1  V  J 


where  Ai  are  the  entries  of  A,  associated  with  all  field  vectors  internal  to  the  i-th  processor's 

.  mmshared  k 

subdomain,  Aij  are  the  entries  of  Ai  associated  with  all  field  vectors  in  the  i-th  processor's 
subdomain  that  lie  on  the  boundary  shared  with  the  y'-th  domain,  N shared  is  the  number  of 
processors  that  share  boundaries  with  the  i-th  processor,  and  Rx  is  the  receive  operator,  receiving 
die  vector  of  data  from  the  y-th  processor.  The  first  two  expressions  on  the  right-hand-side  of 
(2.22)  are  done  completely  in  parallel  on  each  processor,  and  the  final  term  requires  interprocessor 
communication.  If  the  subdomains  are  equal  in  size,  then  the  work  load  required  to  perform  the 
matrix-vector  products  will  be  well  balanced.  If  the  number  of  edges  shared  by  sub  domains  are 
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minimized,  the  lengths  of  die  interprocessor  communications  will  be  minimal  (note  that  the  number 
of  sub  domains  that  each  sub  domain  has  interfaces  with  should  be  minimized  as  well). 
Furthermore,  due  to  the  regularity  of  the  grid,  the  matrix  vector  multiplications  It  was  shown  in 
[3]  that  this  leads  to  a  highly  efficient  parallel  algorithm. 

22  J  Results 

*"?.  'Ofe.-oi 

The  parallel  algorithms  have  been  implemented  in  FORTRAN  on  the  32-node  iPSC/860 
hypercube.  Care  has  been  taken  to  optimize  the  codes  for  floating  point  and  memory  efficiency. 
For  example  vectorization  is  exploited  in  the  inner  loops  of  the  computationally  intensive  tasks  for 
each  algorithm.  Furthermore,  full  optimization  was  utilized  when  compiling  the  programs. 

In  each  of  the  algorithms,  unbounded  media  are  modeled  through  the  use  of  an  absorbing 
boundary  condition.  The  second-order  dispersive  boundary  condition  [7]  was  implemented  to 
update  the  tangential  fields  on  the  exterior  truncation  boundary. 

The  expected  storage  requirements  and  the  number  of  floating  point  operations  required  for  each 
algorithm  is  presented  in  Table  2.1.  In  this  table,  Nx,  Ny,  and  Nz,  represent  the  number  of  lattice 
cells  along  the  x,  y,  and  z-directions  for  the  FDTD  algorithm.  Ne,  Nc,  and  N„  represent  the 
number  of  edges,  cells,  and  nodes  in  the  two-dimensional  unstructured  meshes  used  by  the  PGY 
algorithm,  and  Nx  represents  the  number  of  cells  along  the  vertical  z-direction.  Neo  is  the  number 
of  edges  in  the  mesh  which  will  require  a  projection  operation  for  the  PGY-algorithm  (in  regions 
where  the  mesh  is  orthogonal,  the  projection  is  not  needed).  Typically,  for  quadrilateral  meshes, 
Ne  *  2 Nc  «  2 Nn .  For  a  uniform  mesh,  it  can  be  assumed  that  Nc  =  NxNy .  By  exploiting  planar 

symmetry,  the  memory  needed  to  store  the  PGY  matrices  is  quite  negligible  compared  to  the 
memory  required  to  store  the  three-dimensional  fields.  As  a  result,  the  PGY-algorithm  requires 
only  ~  25  %  more  storage  than  the  FDTD  algorithm.  However,  for  problems  with  complex 
geometries,  the  number  of  quadrilateral  cells  needed  to  accurately  model  the  geometry  is  expected 
to  be  much  smaller  than  that  required  by  the  FDTD  method,  and  this  deficiency  can  be  easily 
accounted  for.  The  floating  point  operations  required  by  the  PGY  algorithm  and  the  FDTD 
algorithm  for  an  orthogonal  mesh  of  equivalent  size  are  exactly  the  same.  Nonorthogonal  meshes 
introduce  additional  computations  for  the  PGY-algorithm  due  to  the  projection  operators. 


Table  2.1 

Storage  and  Floating  Point  Requirements  of  Each  Algorithm 


Storage 
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Fig.  2.4  Low  pass  microstrip  filter  printed  on  a  dielectric  substrate. 

Furthermore,  the  vector  lengths  of  the  FDTD  algorithm  will  typically  be  longer,  leading  to 
improved  floating  point  speeds.  Thus  it  is  expected  that  the  FDTD  algorithm  will  be  faster 
computationally. 

Benchmark  comparisons  of  the  two  codes  are  provided  here  through  a  simple  test  case,  which  is 
the  microstrip  low-pass  filter  illustrated  in  Fig.  2.4.  This  problem  has  been  previously  modeled 
using  the  FDTD  algorithm  [8],  and  conforms  to  a  regular  grid.  To  this  end,  the  FDTD  grid  is 
defined  by  an  80  x  1 10  X  16  lattice.  Unstructured  grids  were  also  generated  to  model  this  problem 
for  the  PGY  algorithm.  The  grids  were  created  to  achieve  roughly  the  same  level  of  accuracy  as 
the  FDTD  algorithm.  To  this  end,  the  PGY  mesh  consisted  of  8006  quadrilateral  cells,  and  8160 
nodes  by  16  cells  in  the  z-direction.  The  mesh  was  generated  using  SDRC  IDEAS,  and  was 
partitioned  using  the  RIP  partitioning  scheme.  This  task  was  performed  on  an  HP-workstation. 

The  microstrip  line  was  excited  by  a  voltage  source  with  a  Gaussian  pulse  wave  form,  which 
had  a  30  GHz  bandwidth.  In  each  case  4000  time  iterations  were  used  to  reach  a  converged 
solution.  Initially,  keeping  the  problem  size  fixed,  the  problem  was  executed  on  a  number  of 
processors  varying  from  1  to  32.  The  CPU  times  required  to  perform  the  4000  time  iterations  are 
recorded  in  Table  2.2.  Clearly,  the  FDTD  algorithm  is  the  most  computationally  efficient  - 
especially  on  the  uniprocessor  system,  where  it  ran  in  roughly  1/3  of  the  time  required  by  the  PGY 
algorithm.  This  is  due  to  the  reduction  in  floating  point  operations  and  the  fact  that  higher  floating 
point  speeds  are  realized  by  the  FDTD  algorithm  since  the  vector  lengths  are  much  longer. 
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Table  2.2 

CPU  Times  Recorded  .vs.  #  of  Processors  (P) 
(Wall  Clock  Time  (seconds)  -  iPSC/860) 
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Table  2.3. 

Parallel  Efficiency  .vs.  #  of  Processors  (P) 
Based  on  a  Fixed  Speedup  Study 
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Table  2.3  presents  the  parallel  efficiency  of  the  algorithms  based  on  the  fixed  speedup  study 
presented  in  Table  2.2.  Interestingly,  the  FDTD  algorithm  only  records  a  speedup  of  53  %  over  32 
processors.  This  is  somewhat  deceiving,  however,  since  the  MFLOPS  rate  of  each  processor  has 
been  reduced  by  64  %  due  to  the  shorter  vector  lengths!  If  we  scale  the  computational  time  by  this 
amount,  the  efficiency  would  be  83  %  (i.e.,  a  speedup  of  26.5  over  32  processors).  This  is  quite 
good,  as  expected  for  this  algorithm.  Nevertheless,  there  is  degradation  in  the  efficiency  due  to  the 
required  interprocessor  communication,  as  well  as  slight  load  imbalances  and  redundant 
computation.  The  PGY-algorithm  realizes  a  64  %  parallel  efficiency  (i.e.,  a  speedup  of  20.5  over 
32  processors).  Here,  additional  efficiency  is  principally  lost  due  to  load  imbalances  and 
interprocessor  communication. 

It  is  also  instructive  to  measure  the  scaled  speedup  of  the  algorithms.  Specifically,  rather  than 
keeping  the  problem  size  constant  and  increasing  the  number  of  processors,  a  scaled  speedup  study 
Increases  the  problem  size  with  the  number  of  processors.  This  study  was  done  on  the  512-node 
Intel  Delta.  Each  node  of  the  Intel  Delta  also  hosts  a  40  MHz  i860  RISC  processor,  with  only  12 
Mb  or  RAM.  Thus  the  same  floating  point  speeds  as  the  iPSC/860  arc  realized.  Furthermore,  the 
same  code  developed  on  the  iPSC/860  was  directly  ported  to  the  Delta.  Table  2.4  illustrates  the 
CPU  time  per  time  iteration  required  by  each  of  the  algorithms  as  the  problem  size  is  scaled  with 
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Table  2.4. 

Scaled  Speedup 

CPU  seconds  /  Time  Iteration  .vs.  #  of  Processors  (P) 
(Wall  Clock  Time  (seconds)  -  Intel  Delta) 
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the  number  of  processors.  Here  the  problem  size  is  scaled  by  increasing  the  mesh  size  in  the 
transverse  directions.  It  is  noticed  that  the  algorithms  scale  extremely  well,  and  excellent  parallel 
efficiencies  are  observed.  Slight  jumps  are  realized  as  the  problem  is  scaled  beyond  the 
uniprocessor  system  due  to  the  introduction  of  interprocessor  communication  and  serial 
computation.  Nevertheless,  it  can  be  concluded  that  the  CPU  time  require  for  a  problem  of  size  N 
on  1  processor  will  execute  in  roughly  the  same  amount  of  time  as  a  problem  of  size  N*P  on  P 
processors.  The  caveat  is  that  as  N  is  increased,  the  total  number  of  time  iterations  needed  to  reach 
a  steady  state  will  grow  roughly  as  0(jN  ). 

A  second  problem  is  presented  to  illustrate  the  effects  of  load  balancing  on  the  PGY  algorithm. 
To  this  end,  a  cylindrical  via  through  a  ground  plane  coupling  two  50  12  microstrip  lines  is 
presented.  This  is  illustrated  in  Fig.  2.5.  The  via  was  analyzed  on  a  32-node  iPSC/860.  The  two- 
dimensional  mesh  modeling  the  via  consisted  of  4867  quadrilateral  cells.  The  three-dimensional 
mesh  was  40  cells  high  along  the  vertical  direction.  The  mesh  was  spatially  decomposed  using 
both  the  RIP  and  the  Greedy  algorithms.  The  results  of  the  decompositions  are  illustrated  in  Table 
2.5  The  full  simulation  required  4000  time  iterations.  The  CPU  times  required  to  perform  the 
simulation  versus  the  number  of  processors  are  illustrated  in  Table  2.6,  comparing  the  times  that 
resulted  from  using  the  RIP  and  the  Greedy  algorithms.  Clearly,  the  Greedy  algorithm  results  in 
improved  CPU  times.  Figure  2.6  illustrates  the  speedups  of  the  parallel  algorithm,  again  based  on 
the  RIP  and  Greedy  decompositions.  The  speedup  here  is  defined  as  being  the  ratio  of  the  CPU 
time  required  to  execute  the  problem  on  P  processors  to  that  required  by  a  single  processor.  This 
is  also  compared  to  the  ideal  case  of  a  linear  speedup.  Excellent  speedups  are  observed  over  the  32 
processors.  Finally,  the  magnitude  of  the  S -parameters  are  illustrated  in  Fig.  2.7.  These  results 
compare  very  well  with  the  measured  results  presented  in  [9]. 
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Top  View 


Side  View 


0.7  mm 

Fig.  2.S  Geometry  of  a  cylindrical  via  through  a  PEC  ground  plane. 

Table  2.5. 

Load  Balance  of  the  RIP  and  Greedy  Algorithms 
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Table  2.6 

CPU  Times  Recorded  .vs.  #  of  Processors  (P)  on  iPSC/860 
for  the  Microstrip  Via  (4000  time  iterations) 
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Fig.  2.6  Magnitude  of  the  S-parameters  for  the  cylindrical  via  through  a  ground  plane  computed 
■sing  the  planar  generalized  Yee-algorithm. 


Fig.  2.7  Magnitude  of  the  S-parameters  for  the  cylindrical  via  through  a  ground  plane  computed 
using  the  planar  generalized  Yee-algorithm. 
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2.3.  Electromagnetic  Properties  of  Materials  -  K.  w.  Whites 

The  emphasis  of  this  research  is  on  the  computation  of  the  effective  constitutive  parameters  for 
synthetic  chiral  materials  composed  of  (lossy)  inclusions  distributed  in  a  (lossy)  host,  i.e. 
suspension,  material.  The  “synthetic”  attribute  arises  since  the  frequency  spectrum  under 
consideration  is  the  microwave  region  where  only  man-made  chiral  materials  are  known  to  exist. 

The  basis  for  computation  of  these  material  parameters  is  a  new  methodology  developed  by  the 
author  and  involves  two  basic  components.  The  first  is  purely  numerical  and  involves  a  Monte- 
Cario  method  to  compute  the  averaged  scattering  by  a  randomly  distributed  ensemble  of  inclusions 
which  are  constrained  to  always  lie  within  a  free-space  volume  of  canonical  shape.  In  this  work,  a 
spherical  region  is  used.  This  averaging  process  is  continued  until  convergence  is  obtained  in  the 
far-scattered  fields.  The  second  component  is  a  mixture  of  analytical  and  numerical  methods.  The 
analytical  part  is  to  obtain  a  solution  for  the  far-scattered  fields  of  the  canonically-shaped  volume 
with  constitutive  equations  of  a  presumed  form.  That  is,  a  continuum  model  is  applied.  The 
numerical  part  is  to  match  the  scattered  fields  computed  from  tire  Monte-Carlo  technique  with  those 
from  the  continuum  model  by  varying  the  constitutive  parameters.  This  is  a  difficult  task  due  to  the 
nonlinear  behavior  of  the  fields  as  a  function  of  the  constitutive  parameters.  This  technique  fully 
accounts  for  all  mutual  interactions  between  inclusions  as  well  as  self-interactions  -  no  analytical 
simplifying  models  are  used. 

In  the  physical  application  of  these  artificial  materials,  the  handed  inclusions  will  likely  be 
suspended  in  a  host  having  dielectric  and/or  magnetic  properties  different  than  those  of  free  space. 
Therefore,  quantification  of  the  interaction  between  the  inclusions  and  their  host  as  well  as  the 
effects  on  the  macroscopic  parameters  are  of  utmost  importance.  In  the  following  section,  the 
numerical  methodology  for  computing  these  effective  constitutive  parameters  will  be  outlined  with 
emphasis  on  the  wire-class  inclusion  geometry.  Using  this  technique,  the  host  material  effects  on 
the  macroscopic  constitutive  parameters  will  be  shown  for  both  lossless  and  lossy  host  materials. 

2.3.1  Solution  Methodology 

The  composite  chiral  materials  under  investigation  in  this  research  effort  are  composed  of  many 
thin-wire  helices  which  are  randomly  oriented  and  distributed  within  a  host  which  is  assumed  to  be 
ample,  homogeneous  and,  perhaps,  lossy  in  the  sense  that  both  er  and  nr  are  imaginary.  The 
constitutive  model  chosen  to  describe  this  “artificial”  material  is  the  Drude,  Bom,  Fedcrov  form 
Z)  =  £E  +  e/JV  x  E 

H  =  0+HfNxH  (31) 

The  computation  of  the  effective  constitutive  parameters  for  a  composite  material  is  a  non-trivial 
task  due  to  the  complex  nature  of  this  structure.  As  described  in  [1]  for  a  free-space  host  and 
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lossless  inclusions,  this  difficulty  can  be  overcome  by  applying  a  Monte-Carlo  numerical  solution 
together  with  an  analytical  solution  for  the  scattering  by  a  chiral  material  sphere.  This  technique  is 
inherently  numerical  in  nature  and  is  computationally  intensive.  However,  the  effective 
constitutive  parameters  in  (3.1)  can  be  quantified  for  any  thin-wire  shaped  inclusion. 

All  of  the  inclusion  shapes  to  be  analyzed  in  this  summary  are  composed  thin  wires,  such  as  a 
helix.  Using  die  Resistive  Tube  Boundary  Condition  (RTBC)  [2],  an  accurate  and  numerically 
efficient  simulation  for  material  wires  can  be  obtained  using  a  modified  electric  field  integral 
equation  (TEFLE)  for  die  vector  current  7.  Assuming  that  the  length  of  the  wire  inclusion  is  much 
greater  than  die  diameter  of  the  wire,  the  EFIE  is 


t  ■  E‘  =  ^  i  •  7(F) + jkru  ■  A  (F ) + 1  •  VO,  (F) 


(3.2) 


where  t  is  a  unit  tangent  vector  at  any  point  on  all  inclusions  and  the  superscript  /  indicates  the 
incident  electric  field  assuming  aneJ“  time  dependence.  In  (3.2), 

rri-  Pit? 


A(r)  =  I{r')*G^{r\r')  and  <D<(r)  =  -^-^*G^(r Jr') 

£ 

where  the  current  and  line  charge  are  convolved  with  the  smooth,  thin-wire  kernel 


(3.3,3.4) 


AvR* 


(3.5) 


and  R’t  is  taken  as  the  distance  from  a  point  on  the  axis  of  any  wire  (source  point)  to  another  point 
on  the  surface  of  any  wire  (observation  point).  The  complex  surface  resistivity  Rs  in  (3.2)  is  given 
as  [2] 

AM  Vd 


[JoM  VoMka)} 


(3.6) 


A  moment  method  (MM)  solution  is  formulated  following  the  work  of  Glisson  and  Wilton  [3] 
where  the  current  on  the  wires  is  expanded  in  a  set  of  N  piece- wise  triangular  basis  functions  with 
amplitudes  an  as 


N  f  r+  r-" 

Va  — — — 
^  "  ;+  /- 
««1  ln  . 


(3.7) 


with  the  definitions  of  the  quantities  in  (3.7)  shown  in  Figure  3.1.  The  resulting  discretized 
integro-differential  equation  is  tested  with  N  piece-wise  constant  functions,  each  extending  from 
the  centroid  of  one  segment  to  the  centroid  of  an  adjacent  segment  (contour  Cm  as  shown  in  Figure 
3.1),  to  form  the  N-by-N  constant  coefficient  matrix  equation 

[V_]  =  [Z„Ia.]  (3.8) 
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basis  function  n  associated 
with  this  node 


Fig.  3.1.  Geometry  relevant  to  the  numerical  scattering  solution  for  thin-wire  structures. 


where  Vm  =  ji(r)-E‘(r)dl  and  ZM  =  jki] J/(r)  A(r)d/  +  <I>r(^)-^e(rmc')  .  (3.9) 

c«  cm 

Hie  second  step  in  the  application  of  this  technique  is  to  compute  the  Monte-Carlo  averaged 
scattered  fields  for  an  ensemble  of  thin-wire  inclusions.  For  a  particular  distribution  of  thin-wire 
objects,  the  far-scattered  fields  can  be  found  from  the  Fourier  transform  of  the  vector  induced 
current  7,  for  i  =  0,<p,  as 

e; (0,<P)  ~ -jkT]g(r)  i  ■  {?[! (r ')]}  where  k  =  (ojv^e^,  T)  =  10) 

with 

g(r)  =  and  7[I(r'j\=  jf(r')e*rdl'  .  (3.11) 

wires 

In  (3.10),  it  is  presumed  that  the  currents  have  been  numerically  computed  using  the  formulation 
discussed  above.  Since  a  material  necessarily  implies  a  very  large  number  of  small  inclusions,  a 
“brute  force”  numerical  solution  would  be  cost  prohibitive. 

To  overcome  this  serious  constraint,  a  Monte-Carlo  method  is  applied  where  die  averaged,  far- 
•cattered  fields  are  computed  for  a  relatively  small  number  of  inclusions  using  the  “random 
redistribution  algorithm"  of  [1].  With  this  method,  a  plane  wave  is  assumed  incident  at  some  fixed 
angle  and  the  far-scattered  electric  fields  are  repeatedly  computed  using  (3.10)  for  many  different 
randomly  computed  arrangements  of  the  inclusions.  During  each  iteration,  the  inclusions  are 
constrained  to  always  lie  within  an  imaginary  spherical  region,  for  reasons  to  be  discussed  below. 
Using  all  of  these  far-scattered  fields,  an  arithmetic  mean  is  computed  at  each  angle  of  observation. 
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The  second,  and  final,  step  in  computing  the  effective  constitutive  parameters  for  these 
composite  chiral  materials  is  to  compare  the  Monte-Carlo  averaged  scattered  fields  with  those 
computed  analytically  for  a  chiral  material  sphere.  Such  an  analytical  solution  was  obtained  by 
Bohren  [4],  and  the  form  of  these  far-scattered  fields  is 


jAn  in  2 n  + 1 


k  tZn(n  + 1) 

Bg(r)Er*{e,<P) 

/  \  Ait  2n  +  1  ^ 

~g(0— 2,~7  ~TT 

jk  t*n(n  + 1) 


J>(1)  ' 

(a,  cos0  -  c„  sin0)-^ + (b„  cos  </>+dm  sin  (p)P^ 


•  pO)  ] 

(a„  sin  <f>  +  c„  cos  <fi)P^l)  +  (b„  sin  <P~dn  cos^-r— 

sin  0 


(3.12) 


(3.13) 


*g(r)E?*(  04) 

/ 

where =  P„(1)(cos0)  is  the  associated  Legendre  polynomial  of  the  first  kind,  and  />(1)  = 


dpp 

<te 


In  (3.12)  and  (3.13)  it  is  assumed  that  the  plane  wave  is  incident  in  the  z-direction  with  the  electric 
field  polarized  parallel  to  the  Jt-axis.  Therefore,  the  incident  field  in  the  numerical  solution  for 
(3.10)  will  also  have  this  restriction. 

An  analytical  computation  of  the  effective  constitutive  parameters  obtained  by  equating  the 
“measured”  scattered  fields  of  (3.10)  to  the  “computed”  fields  of  (3.12)  and  (3.13)  is  extremely 
complicated  due  to  the  nonlinear  dependence.  It  has  been  found  that  the  method  of  simulated 
annealing  is  a  cost  efficient  and  highly  accurate  method  for  finding  these  material  constants  [1]. 
Using  this  approach  a  energy  function,  E,  is  defined  as  the  difference  between  the  measured  and 
computed  scattered  fields  at  a  discrete  list  of/ observation  angles  (0,,$,) 


(3.14) 


and  subsequently  minimized  to  yield  the  effective  constitutive  parameters.  By  the  uniqueness 
theorem,  this  set  of  resulting  values  will  be  unique  [1]. 


2.3.2  Results 

2.3.2.a  Partial  Verification  nf  the  Material  Model  Method 

As  a  partial  verification  of  the  accuracy  of  the  numerical  codes  and  the  robustness  of  the 
algorithms  used  in  the  Monte-Carlo  method  for  numerically  modeling  artificial  materials,  a  test 
problem  composed  of  a  cloud  of  pec  rods  (shown  in  Figure  3.2)  will  be  analyzed.  Following  the 
development  in  [5],  an  analytical  approximation  for  the  electric  polarizability  of  a  single  pec  rod. 
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ar°,  can  be  determined  by  approximating  die  rod  as  a  truncated  biconical  structure  with  a  radius 
which  varies  as  a  function  of  distance  from  the  centroid  of  die  bicone.  For  a  rod  of  length  2 L  and 
radius  b  [S] 


The  end  effects  have  been  ignored  in  (3.15)  which,  assuming  thin  rods,  is  a  reasonable 
approximation.  However,  for  increasingly  large  radii  there  is  additional  charge  accumulation  on 
the  endcaps  of  die  rod  which  would  tend  to  increase  a® . 


Fig.  3.2.  Geometry  of  the  thin-wire  inclusion  shapes.  In  the  Monte-Carlo  simulations,  an 
imaginary  exclusion  sphere  of  radius  r^.  is  assumed  about  each  inclusion. 


With  an  incident  electric  field  linearly  polarized  parallel  to  a  small,  isolated  pec  rod,  the  induced 
electric  dipole  moment  p  in  terms  of  the  electric  field  is  given  as 

(3.16) 

for  the  rod  axis  pointing  in  the  rd  direction.  Using  (3.15),  an  effective  media  characterization  for  a 
cloud  of  randomly  distributed,  •on-interacting  pec  rods  can  easily  be  computed.  The  traditional 
definition  of  the  electric  dipole  moment  density  in  terms  of  N  (the  number  density),  the  spatial 
average  of  the  induced  electric  dipole  moment  and  the  electric  susceptibility  is  [6] 

T=N{p)=c,x£  (3.17) 
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where  p  is  given  in  (3.16).  Since  all  mutual  interaction  of  the  rods  is  ignored,  the  spatial  average 
in  (3.17)  can  alternatively  be  computed  by  averaging  (3.16)  over  all  orientation  angles  of  a  single 
rod  as 


_  |  f(64)pded<p 
{P  ~  jjm<t>)d6d<p 


(3.18) 


where  /is  some  weighting  function  and  6  and  </>  are  local  angle  variables.  Choosing /=  1  and 
noting  that  simply  flipping  the  rod  does  not  alter  p,  then 


imt/2 


(p)=±UpdM<p=^E~ 


(3.19) 


0  0 


From  (3.15),  (3.16)  and  (3.19),  the  average  polarizability,  at,  and  effective  electric  susceptibility, 
Xf ,  for  a  cloud  of  N  non-interacting  pec  rods  per  unit  volume  are  then 


^  Xf  = 


Na, 


(3.20) 


A  comparison  between  the  approximate  analytical  treatment  of  the  Rayleigh  scattering  by  a  cloud 
of  randomly  distributed  pec  rods  and  the  numerical  solution  method  of  this  paper  is  given  in  Figure 
3.3.  The  rods  all  have  dimensions  2 L  =  1.5  and  b  =  0.015  mm  and  are  randomly  oriented  and 
distributed  within  a  free-space  spherical  volume  of  radius  20  mm  at  a  frequency  of  1  GHz.  The 
analytical  results  use  (3.20)  while  the  numerical  solutions  are  computed  using  the  random 
redistribution  schedule  of  Section  II.  Also  shown  in  Figure  3.3  is  the  numerical  results  when  all 
mutual  interactions  between  inclusions  are  ignored.  In  this  case  the  separation  between  rods  is  no 
less  than  rwc  =  4L,  while  for  the  full- wave  solution  r^.  =  2 L.  The  fact  that  the  full- wave  results 
lie  between  the  other  two  is  primarily  due  to  a  smaller  basis  function  density  in  the  full-wave 
results. 


2.3.2.b  Free  space  host,  lossless  inclusions  results 

Using  the  methodology  developed  in  the  previous  section,  the  effective  chiral  material 
parameters  for  a  collection  of  pec  helical  inclusions  were  computed  and  are  shown  in  Figure  3.4 
for  the  frequency  range  1  to  13  GHz.  The  three-turn  helices  have  dimensions  a  =  P  =  0.5  mm  and 
b  =  0.05  mm  while  the  confutation  of  the  effective  constitutive  parameters  was  done  using  the 
random  redistribution  Monte-Guio  algorithm  for  a  collection  of  40  helices  in  a  free-space  volume 
with  r^f,  =  4.9  mm.  The  number  of  Monte-Carlo  iterations  required  for  convergence  varied  from 
10-20  at  low  frequencies  where  the  mutual  coupling  was  small,  to  40-60  at  higher  frequencies 
where  the  mutual  coupling  is  increasing. 
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Number  of  Inclusions 

Fig.  3.3.  Effective  media  electric  susceptibility  comparison  for  a  cloud  of  pec  rods  suspended  in 
a  tree-space  spherical  volume  of  radius  r^,h  =  20mm. 


12345678910111213 


f  (GHz) 

Fig.  3.4.  Effective  chiral  media  parameters  for  a  random  distribution  of  three-turn  pec  helices 

having  a-P  =  0.5, b  =  0.05  mm  and N  =  8.117  107/m3.  Forty  helices  suspended  in 
a  free-space  spherical  region  with  rsp^  =  4.9mm  were  used  in  the  simulation.  The 
symbols  ‘o’  and  ‘x’  designate  right-  and  left-handed  helices,  respectively. 
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A  32-node  Intel  iPSC®/860  hypercube,  together  with  the  Intel  ProSolver™-DES  library,  was  used 
to  generate  most  of  the  data  in  this  report  For  each  data  point  shown  in  Figure  3.4,  approximately 
8  node-hours  was  required  for  each  Monte-Carlo  iteration  (fill  and  solve  for  40  helices  and  60 
basis  functions  per  helix)  plus  an  additional  0.5  node-hours  for  the  SA  determination  of  the 
effective  constitutive  parameters.  The  integrals  appearing  in  (3.9)  were  computed  numerically 
suing  a  Romberg  integration  technique  which  results  in  most  of  the  8  node-hours  being  spent 
filling  the  Z-matrix  of  (3.9). 

Returning  to  Figure  3.4,  a  number  of  important  observations  can  be  made  which  strengthen  the 
argument  for  the  accuracy  of  these  effective  material  parameters.  The  first  observation  is  that  both 
e  and  n  are  virtually  identical,  at  the  same  frequency,  for  either  the  right-  or  left-handed  inclusions 
while  P  changes  sign.  These  results  are  consistent  with  the  invariant  scalar  nature  of  e  and  fi  and 
the  pseudoscalar  nature  of  p  [7].  Upon  a  spatial  inversion  of  the  coordinate  system,  the  field 
vectors  E  and  D  behave  as  true  vectors  while  B  and  H  transform  as  pseudovectors  [8]. 
Assuming  form  invariance  of  (3.1),  then  e  and  //  must  remain  unchanged  after  a  spatial  reflection 
(and  a  random  material),  while  P  must  change  sign. 

The  second  observation  from  the  results  in  Figure  3.4  is  that  the  magnitude  of  P  is  of  order  104 
m  while  the  magnitude  of  /i  is  approximately  one  over  the  entire  frequency  range  shown.  This 
same  order  of  magnitude  for  P  was  inferred  from  experimental  data  as  communicated  in  [9],  while 
in  [10]  p  and  n  values  were  obtained,  using  a  simplifying  analytical  model,  near  those  reported  in 
this  work.  It  is  noted  from  the  data  in  Figure  3.4  that  |£/1|<1,  where  k  &  (o-yJJH,  as  required  for 

physically  realizable  dural  materials  [9, 1 1]. 

While  the  above  two  sets  of  observations  lend  credence  to  the  reasonableness  of  the  effective 
constitutive  parameters  for  the  artificial  chiral  material,  it  is  also  important  to  note  their 
phenomenological  behavior  as  a  function  of  frequency.  Below  approximately  4  GHz,  the 
parameters  are  relatively  constant  and  P  remains  nonzero.  However,  even  with  this  sizable  value 
of  p,  the  optical  rotatory  dispersion  (ORD)  for  a  path  length  d  within  the  chiral  material 

0RDE7=--£ir$F  <°/OT)  •  (321) 

is  vanishingly  small  as  shown  in  Figure  3.5.  This  is  consistent  with  the  notion  that  the 
electromagnetic  effects  of  chirality  vanish  at  zero  frequency  [7, 12].  In  the  DBF  equations  (3.1), 
this  can  occur  even  for  non-vanishing  chirality  factor  p.  In  fact,  it  can  be  shown  using  a  Fourier 
transform  approach  as  in  [8,  p.  306]  that  p  is  an  even  function  of  frequency  for  lossless  materials 
and  hence  may  not  vanish  at  dc.  This  is  in  contrast  to  other  forms  of  the  constitutive  equations, 
such  as  the  Lindell-Sihvola  form  [12,  13],  which  necessarily  require  the  chirality  parameter  to 
vanish  at  zero  frequency  [12]. 
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Fig.  3.S.  Optical  rotary  dispersion  (ORD)  for  the  artificial  chiral  material  described  in  Figure 
3.4. 


2,3,2-C  ,  .Lossless  hasm4.ingMioasj£gMlts 

Shown  in  Figure  3.6  are  the  effective  material  parameters  when  the  host  is  a  lossless  dielectric 
having  er  hMt  =  2  with  inclusions  of  three-tum,  right-handed  helices  having  a  =  radius  =  0.6mm,  P 

-  pitch  =  0.3mm  and  b  =  wire  radius  =  0.05mm.  For  these  results,  40  helices  are  suspended  in  a 
spherical  volume  of  radius  r^h  =  5mm  yielding  a  number  density  N  =  7.6394  107  m'3  and  a  metal 
volume  concentration  of  0.68073%.  The  general  behavior  of  this  material  as  a  function  of 
frequency  is  similar  to  that  seen  for  the  free- space  host  As  expected,  the  resonant  frequency  of  the 
inclusion  is  reduced  by  the  factor  ^er  hoC .  For  frequencies  much  lower  than  this  resonant 

frequency,  it  is  noted  that  a  lossless  dielectric  host  produces  an  increase  in  the  effective  relative 
permittivity  proportional  to  the  host  permittivity  while  both  ft  and  n,  increase  only  slightly.  Hence, 
the  increase  in  ORD,  with  respect  to  a  free-space  host,  at  low  frequencies  and  small  fi  is 
proportional  to  £,>host. 

2,3,2,d  Lossy  host,  lossless  inclusions  results 

Figure  3.7  shows  these  material  parameters  for  the  wire  helices  of  the  same  dimensions  and 
number  density  mentioned  above,  except  with  a  host  having  erhott  =2-  jl.  A  notable,  and 

expected,  change  in  the  response  of  this  material,  as  compared  to  the  lossless  one,  is  the 
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Fig.  3.6.  Effective  chiral  media  parameters  for  a  random  distribution  of  three-turn  right-handed 
pec  helices  having  a  =  0.6,  P  =  0.3,  b  =  0.05  mm  and  N  =  7.6394  107/m3.  Forty 
helices  suspended  in  a  free-space  spherical  region  with  r^h  =  4.9mm  were  used  in  the 
simulation.  The  host  material  has  parameters  ^.  host  =  2,  |il  host  =  1. 
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Fig.  3.7.  Effective  chiral  media  parameters  for  a  random  distribution  of  three-tom  right-handed 
pec  helices  with  dimension  given  in  Fig.  3.6.  The  host  material  has  parameters 
fr.host  =  2-jl,  /ti,host  =  ^ 
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appearance  of  the  absorption  band  near  the  2/2  resonance  frequency  of  the  helix  (f  *  8GHz).  Such 
behavior  has  been  noted  before  in  connection  with  composite  chiral  materials  [4]-[6].  Within  the 
absorption  band  there  occurs  an  “anomalous  rotary  dispersion*’  where  the  polarization  tilt  angle 
(not  shown)  of  an  emerging  plane  wave  passing  through  a  slab  of  this  material  changes  sign  as  a 
function  of  frequency.  Similar  behavior  has  been  measured  for  chiral  composites  in  die  microwave 
frequency  bands  [7]. 


2.3,2,e  Free-snace  host,  lossy  inclusions  results 


The  final  data  to  be  presented  in  this  section  on  materials  modeling  is  for  helical  inclusions 
composed  of  lossy  wires,  i.eM  wires  that  have  finite  conductivity.  A  model  which  is  often  used  to 
describe  the  frequency  dispersion  in  the  material  parameters  is  the  single-resonance  Condon  model 
[13].  Since  for  the  DBF  equations  (3.1),  p  is  an  even  function  of  frequency  [1],  the  “Lorentz- 
type”  frequency  dependence  can  be  written  as 


A 

l-x2  +  jDx 


(3.22) 


where  A  is  the  amplitude,  D  is  a  damping  factor,  x  =  0)/o)Q  and  cq0  is  the  resonant  radian 
frequency.  By  defining  p  =  l 3'  +  jP"  it  can  be  shown  that 


O. 

H  1  +  (DJ  —  2)jc2  +xA 


and  P"  =  - 


ADx 

1  +  [D2  -  2jx2  +  xA  ' 


(3.23) 


The  three  constants  appearing  in  (3.23)  are  determined  by  fitting  the  curve  to  the  numerically 
obtained  “exact”  values  for  the  chirality  parameter  p. 

The  effective  chirality  parameter  for  a  three -turn  right-handed  helices  composed  of  lossy  wires 
are  shown  in  Figures  3.8  and  3.9.  In  Figure  3.8,  o  =  lf^S/m  while  in  Figure  3.9,  c  =  105  S/m. 
The  symbol-denoted  data  points  were  obtained  using  the  numerical  methodology  outlined  in  this 
report  together  with  the  iPSC/860.  The  solid  lines  are  the  Lorentz-dispersion  modeled  results.  It 
is  seen  that  the  agreement  is  quite  good  thus  confirming  the  frequency  dependence  of  this  synthetic 
chiral  materials  in  the  anomalous  dispersion  bands. 
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Fig.  3.8.  Effective  chirality  parameter  /?  for  a  random  distribution  of  three-turn  right-handed 
helices  (wire  conductivity  =  10s  S/m)  with  dimension  given  in  Fig.  3.6.  The  host 
material  has  parameters  f*,host  =  1.  *  1 . 
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Fig.  3.9.  Effective  chirality  parameter  for  a  random  distribution  of  three-turn  right-handed 
helices  (wire  conductivity  =  103  S/m)  with  dimension  given  in  Fig.  3.6.  The  host 
material  has  parameters  ^>host  =  1,  /i^host  ■  1. 
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